Astronomy & Astrophysics manuscript no. 2830 


February 5, 2008 


(DOI: will be inserted by hand later) 








Automatic classification of eclipsing binaries light curves using neural networks

Sarro, L.M. (1), Sanchez-Fernandez, C. (2,3), and Gimenez, A. (4)



(1) Dpt. de Inteligencia Artificial, U.N.E.D., c/ Juan del Rosal, 16, 28040 Madrid, Spain

e-mail: lsb@dia.uned.es Telephone: 00 34 91 3988715 Fax: 00 34 91 3988895

(2) Laboratorio de Astrofisica Espacial y Fisica Fundamental, P.O. Box 50727, E-28080 Madrid, Spain

(3) XMM-Newton SOC, ESAC, P.O. Box 50727, E-28080 Madrid, Spain

e-mail: Celia.Sanchez@sciops.esa.int

(4) Research and Scientific Support Department, ESA, ESTEC, Postbus 299, 2200 AG Noordwijk, The Netherlands
e-mail: agimenez@rssd.esa.int



Abstract. 

In this work we present a system for the automatic classification of the light curves of eclipsing binaries. This system is based on 
a classification scheme that aims to separate eclipsing binary systems according to their geometrical configuration in a modified
version of the traditional classification scheme. The classification is performed by a Bayesian ensemble of neural networks 
trained with Hipparcos data of seven different categories including eccentric binary systems and two types of pulsating light 
curve morphologies. 



1. Introduction 

Eclipsing binaries (hereafter EBs) play a fundamental role in 
modern astrophysics for several reasons. First of all, detached 
double-lined EBs without mass transfer between the compo- 
nents are a prime tool to derive fundamental stellar parame- 
ters; joint analysis of their light and radial velocity curves pro- 
vides accurate (1-2%) determinations of masses, radii, and lu- 
minosity ratios. Eclipsing binaries also work as testing grounds 
for stellar structure and evolution models, and as such they 
play a key astrophysical role across the whole HR diagram. 
Recently, the study of EBs in other galaxies and clusters has 
made it possible to explore stellar evolution and to establish 
mass-luminosity laws for galaxies with vastly different evolutionary
and chemical histories from our own Galaxy (such
as the LMC and SMC). Moreover, EBs are beginning to play an
important role in cosmology as distance indicators to nearby 
galaxies. Studies of Galactic early-type binaries have shown 
that distance moduli accurate to ± 0.1 mag are attainable, a
precision comparable to that obtained for individual Cepheid 
variables. As more data are accumulated, studies of these sys- 
tems may lead to an improvement in the extragalactic distance 
scale. 

In recent years, large scale photometric surveys have been 
providing a wealth of light curves of variable stars out of which 
a large number of EB systems can be selected. For example,
the ESA astrometric satellite Hipparcos found 70% new vari- 
ables out of the relatively bright selected sample. The GAIA 
large-scale photometric survey will also have significant scientific
value for the study of nearly all types of variable stars,
including eclipsing binaries. It is expected that about 1 mil- 
lion EBs, those with V < 16 mag, will be discovered. Even if 
reliable physical parameters could be derived for only 1% of 
the observed EBs, this would be a great contribution to stel- 
lar astrophysics in comparison with what has been obtained so 
far from ground-based observations. The Optical Monitoring
Camera (OMC; Mas-Hesse et al., 2003) onboard INTEGRAL
is another example of an instrument that continuously provides 
high quality photometric measurements of thousands of eclips- 
ing and pulsating variables, amongst other objects more closely 
related to high energy astrophysics. Finally, the COnvection 
Rotation and planetary Transits (COROT) mission will produce,
as a by-product, enormous amounts of light curves of objects
with unprecedented accuracy (see e.g. Baglin et al., 2002).
All these vast amounts of data offer the opportunity to se- 
lect not only EB light curves, but all kinds of light curves 
for deeper investigation and/or follow-up. Such databases also 
provide astronomers with powerful heuristics like the possibil- 
ity to construct statistically significant samples of objects that 
can be used as probes for correlations between physical parameters,
e.g., in the case of the rotation-activity correlation
(Jordan & Montesinos, 1991). Nevertheless, and despite all these
encouraging prospects, it is becoming increasingly clear that
intelligent processing of these large datasets is needed, and no 
method based on manual procedures can be used. It is precisely 
the enormity of this volume of data that makes it necessary 
to implement automatic light curve classification tools before 
any serious scientific analysis. Fortunately, it is exactly in these
kinds of tasks (such as pattern recognition, classification, clus- 
tering, and knowledge discovery in the form of dimensional 
correlations) that machine learning and artificial intelligence 
techniques yield their best performances. 

In this paper we concentrate on the applications of neural
networks for the task of light curve identification and clustering.
Neural networks have been widely used in the past
for classification of stellar (Snider et al., 2001) and galactic
(Folkes et al., 1996) spectra, star/galaxy separation in images
(Philip et al., 2002; Cortiglioni et al., 2001), or quasar
detection. Sometimes the classification process takes input
data spanning a combination of spatial and temporal dimensions,
as in the case of solar flare detection, where time
series of images are used for the classification of events
[...] time series prediction [...] identification (Bailer-Jones,
2000; Carroll & Staude, 2001), and telescope control (Sandler et al.,
1991), to cite but a few.



In the specific field of light curve analysis, neural networks
have been used recently for clustering purposes (Brett et al.,
2004) and for microlensing detection (Belokurov et al., 2003;
Belokurov et al., 2004). Here we present a refined classifier for
eclipsing binaries based on state-of-the-art neural networks that 
builds upon some of the work presented in these previous de- 
velopments. 

In this work we apply Bayesian techniques to the train- 
ing of neural networks for the automatic classification of light 
curves of variable stars based solely on their morphological as- 
pects. The network is able to recognize four types of eclips- 
ing binary systems and two types of pulsating star light curves. 
Furthermore, all the types define a link between the morphol- 
ogy of the light curve and the underlying physical scenarios 
as much as possible. In Sect. 2 we describe the classification
scheme in detail; in Sect. 5 we describe the preprocessing of
the data and the neural network architecture and training; in
Sect. 3 we describe the results obtained, assess the quality and
performance of the system and analyse the resulting connection
topology; finally, in Sect. 4 we summarise the conclusions of
this work.

2. Classification scheme 

One obvious requirement of any classification system is the 
direct link between the features used as input and the classes 
defined from them. In the realm of variable systems, unfortu- 
nately, we find that either the classes established up to now 
are not consistently defined in terms of the light curves, as in 
the case of eclipsing binaries, or there are degeneracies, as in 
the case of pulsating stars, in the sense that different categories 
can have morphologically identical light curves. With pulsating 
variable light curves, the degeneracy can only be resolved with 
supplementary spectral information and periods. This problem 
will be addressed in a future paper where a multi-agent expert 
system will be presented, which is capable of classifying pul- 
sating stars (identified by their light curves) by navigating the 
Virtual Observatory space searching for discriminant observa- 
tions. Here we restrict ourselves to the problem of separating 



pulsating variables from eclipsing binary systems and subclas- 
sifying the latter into physically inspired classes univocally de- 
fined in terms of their light curves. 

The two main factors that determine the shape of the light 
curve of an eclipsing binary system are its geometric config- 
uration (i.e. the size of the component stars relative to their 
Roche lobes), which determines the fraction of the light curve 
occupied by eclipses, and the relative brightness of the stellar 
components, which determines the eclipse depths. The inclina- 
tion of the system with respect to the line of sight can affect the 
depths of the eclipses, but its effect on the overall light curve 
morphology is less important. 

We propose here a classification scheme which aims to 
separate eclipsing binary systems according to their geomet- 
rical configuration. This scheme is adapted from the histor- 
ical classification of eclipsing binary light curves into three 
groups (Algol, Beta Lyrae, and W Ursae Majoris), but attempts 
to solve the problems of class heterogeneity and subjectivity 
of the traditional light curve classification, which includes sys- 
tems with different physical properties in the same group. Our 
classification relates the groups established to the geometry, in 
the sense that systems with the same geometrical configuration 
are classified in the same group. 

We note here a previous attempt to solve the degeneracy of
the traditional classification of eclipsing binaries light curves
by Alcock et al. (1997). They proposed a decimal classification
scheme based on combining the relative radii of the stars and 
the surface relative flux ratio. As an alternative, when only the 
light curve morphology is available, we found that a simple 
4-group classification scheme suffices to separate the systems 
into homogeneous classes. 

2.1. Our classification

Definition of the classes assumes that the light curve has been 
processed such that the phase of the deeper eclipse (which we refer to
as the primary eclipse) is defined as 0.75. Systems are classified into
4 groups as follows: 

- Class 1 systems: light curves with well-defined start and 
end to both eclipses. These light curves may present small 
curvatures out of eclipses, but this curvature never masks 
the beginning and end of the occultations. 

- Class 2 systems: light curves with only a well-defined pri- 
mary eclipse, while the secondary has no clear beginning 
or end. 

- Class 3 systems: light curves with eclipses of different
depth and no flat light curve out of eclipse. In these systems,
the light curve curvature out of eclipse masks the beginning
and end of the occultations.

- Class 4 systems: light curves with alternating eclipses of
equal depth, and no flat light curve out of eclipse.

Figure 1 shows example light curves from the Hipparcos
catalogue for each class.

Fig. 2. System parameters grouped by class as defined in the
text. Orbital periods (days) and separations (expressed in solar
radii) are shown in the left column plots; total mass of the
system (expressed in solar masses) and mass ratios are shown
in the right column plots.

2.2. Application to a sample of systems 

In order to show the relation between the classes established by 
this scheme and some of the system parameters, we classified 
a set of 81 binary systems with well-studied light curves and
precisely determined physical parameters. The list of systems 
used in this study and their main physical parameters can be 
found in Tables 

In the following, we will analyse the classes in terms of the
component masses, orbital separation, mass ratio, and filling
factors of the 81 systems included in our sample. In order to
help with the interpretation of the combination of any two such
parameters, we first show in Fig. 2 the sample masses, orbital
separations, and mass ratios for the systems classified in each
class. We can see in the total mass plot that there is no discrim- 
inant boundary or general trend between classes, although type 
4 systems seem to be characterized by a lower mass. Orbital 
separations, on the other hand, show a decreasing trend towards 
higher types. We see how an apparent segregation in the total 
masses of type 3 systems into two sets (low and high mass ob- 
jects) is reproduced in the separation plot in the sense that the 
less massive systems also have lower orbital separations and 
vice versa. This segregation into two groups may be an artifact 
caused by a limited sample size. Thus, more systems of this 
class with accurately determined parameters are needed to clar- 
ify whether two different populations with similar light curves 
indeed exist or whether there is a continuous transition. Finally,
the mean value of the mass ratio of the components shows val- 
ues closer to 1 for type 1 systems, lowest values for type 2, and 
increasing values thereafter (types 3 and 4). 

Fig. 3. Radius to orbital separation ratio of the primary components
in the sample as a function of the mass ratio q.

If we now plot the radius to orbital separation ratio for both
components of each system as a function of the mass ratio q
(defined as q = M2/M1, M1 being the more massive star), we
obtain Figs. 3 and 4, where we have also included the Roche lobe
size (in orbital separation units) computed using the approximation
of Eggleton (1983). Figure 3 clearly shows that primary
components of type 1 and 2 systems are well below the Roche 
lobe radius, while type 3 primaries are close to it, and type 4 
primary stars clearly fill their Roche lobes. At the same time, 
only radii of type 1 system secondary stars are clearly below 
the Roche limit. Under this perspective, it is evident that our 
classification scheme is a morphological transposition of the 
different geometrical configurations: type 1 systems are com- 
posed of two stars with radii clearly below the Roche lobe (de- 
tached systems); type 2 systems are composed of a primary star 
well below its Roche limit and a secondary filling its Roche 
lobe (i.e. semidetached systems); type 3 systems have a pri- 
mary component close to filling its Roche lobe and a secondary 
component already filling the critical lobe and therefore, they 
represent semidetached systems close to contact; finally, type 
4 light curves represent contact binaries with both components 
filling their Roche lobes and possibly exceeding them. We will 
pursue further the implications of this scheme after considering
possible correlations between total mass, orbital separation,
and mass ratios.
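
The comparison between relative radii and Roche lobe sizes can be illustrated with a short sketch. It uses the Eggleton (1983) approximation, r_L/a = 0.49 x^(2/3) / (0.6 x^(2/3) + ln(1 + x^(1/3))), where x is the mass of the star whose lobe is computed divided by the mass of its companion. With the convention of this paper (q = M2/M1), the lobe of the primary therefore uses 1/q and that of the secondary uses q. The function names and the example system are hypothetical and are not taken from the sample.

```python
import math


def roche_lobe_radius(mass_ratio):
    """Eggleton (1983) approximation to the Roche lobe radius, in units of
    the orbital separation a.

    mass_ratio is M_star / M_companion for the star whose lobe is wanted.
    """
    if mass_ratio <= 0:
        raise ValueError("mass ratio must be positive")
    q23 = mass_ratio ** (2.0 / 3.0)
    q13 = mass_ratio ** (1.0 / 3.0)
    return 0.49 * q23 / (0.6 * q23 + math.log(1.0 + q13))


def filling_factors(r1_over_a, r2_over_a, q):
    """Ratio of each stellar radius to its own Roche lobe radius.

    q = M2 / M1 with M1 the more massive star (the convention of the text),
    so the primary's lobe uses 1/q and the secondary's lobe uses q.
    """
    lobe1 = roche_lobe_radius(1.0 / q)
    lobe2 = roche_lobe_radius(q)
    return r1_over_a / lobe1, r2_over_a / lobe2


# Hypothetical system (illustrative numbers only): q = 0.5, R1/a = 0.20, R2/a = 0.30.
f1, f2 = filling_factors(0.20, 0.30, 0.5)
print(f"primary fills {f1:.2f} of its lobe, secondary fills {f2:.2f}")
```

A filling factor well below 1 for both stars would correspond to the detached geometry of type 1, while a secondary close to 1 with a primary well below 1 would correspond to the semidetached geometry of type 2.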

Figure 5 represents all systems in the sample in the
log(M_tot)-log(a) space, with M_tot the system total mass and a
the orbital separation in solar radius units. Although there is
clearly no separability in this space, there are evident trends 
in the data. Again, type 4 systems are found in the low orbital 
separation and low total mass region of the plot, and seem to 
follow a tight linear relation. The rest of the types continue this 
correlation with increasing values of the dispersion: low mass 
type 3 systems follow the trend to the right with higher val- 
ues of both parameters, then type 2 systems, and finally, with 
a high degree of overlapping, type 1 systems occupy the high 
total mass, high orbital separation region of the plot. 
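
The quantities discussed here (period, orbital separation, and total mass) are linked by Kepler's third law, a^3 = G M_tot P^2 / (4 pi^2). The sketch below evaluates it in the units used in the text. The constants are standard nominal values, while the function name and the two example systems are hypothetical illustrations rather than entries of the sample.

```python
import math

G_M_SUN = 1.32712440018e20  # m^3 s^-2, solar gravitational parameter
R_SUN = 6.957e8             # m, nominal solar radius
DAY = 86400.0               # s


def separation_in_solar_radii(period_days, total_mass_msun):
    """Orbital separation a from Kepler's third law,
    a^3 = G * M_tot * P^2 / (4 * pi^2), returned in solar radii."""
    period_s = period_days * DAY
    a_cubed = G_M_SUN * total_mass_msun * period_s ** 2 / (4.0 * math.pi ** 2)
    return a_cubed ** (1.0 / 3.0) / R_SUN


# Hypothetical systems: a short-period, low-mass one and a longer-period,
# more massive one (illustrative numbers only).
short_period = separation_in_solar_radii(period_days=0.4, total_mass_msun=1.5)
long_period = separation_in_solar_radii(period_days=5.0, total_mass_msun=6.0)
print(f"{short_period:.1f} R_sun, {long_period:.1f} R_sun")
```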

Having reviewed the physical characteristics of the proposed
classes, we can reformulate the definitions, this time summarizing
the regions of parameter space where we can expect to
find the systems.

Fig. 5. Logarithmic representation of the eclipsing binary system
total mass (M_tot) as a function of the orbital separation a in
solar radius units.

2.2.1. Systems with type 1 light curves

These are detached systems with widely varying total masses
and a wide range of spectral types from O to F. Most of the
systems assigned to this class have mass ratios close to 1 due
to selection effects. Both components are well within their Roche
lobes, and the orbital separations are in the 10-100 R⊙ range. All
these properties result in light curves with well-defined beginnings
and ends of both eclipses and flat regions outside them.
These binaries are the best source of information to study stellar
absolute dimensions and structure. Most systems included
in this group are eccentric.

2.2.2. Systems with type 2 light curves

Systems classified as type 2 have low mass ratios and total
masses below 10 M⊙. In the systems studied, the primary
component of spectral type A or B is in the hydrogen-burning
phase, and the secondary component (that originally transferred
a significant fraction of its mass through the L1 Lagrange
point) has spectral type G or K and has a mass around or below
the solar mass. These systems have their origin in detached
systems in which the most massive star evolved out of the main
sequence, filled its Roche lobe, and transferred mass to the secondary
component through L1 until the original secondary became
the most massive component. Although the details of the
process remain unclear (Hall, 1975; Ziolkowski, 1976), system
mass loss cannot be discarded.

2.2.3. Systems with type 3 light curves

Type 3 systems in the sample show two different clusters in
parameter space. They are all semidetached systems with the
primary component close to filling the Roche lobe and, due to
the proximity of the components, both eclipses alternate without
intereclipse flat intervals.

The most populated cluster is composed of systems with
total masses lower than 5 M⊙ and is characterized by short period
(less than 1 day) orbits and small orbital separations in
close-to-contact configurations. The primary components are
spectral types A or F, and the secondary stars are one or two
types cooler.

The second, less populated cluster of systems corresponds
to total masses above 10 M⊙ and periods in the 1-3 days range.
They show moderate mass ratios, and both components are
of similar spectral types around B. Again, the more massive
primary component is close to its Roche lobe but separated
from it, and the evolved secondary is in contact with its lobe.
They probably originated in very close orbits, with mass ratios
around 1, and evolved to near contact configurations as the stars
expanded as a consequence of Main Sequence evolution.


2.2.4. Systems with type 4 light curves 

All light curves classified as type 4 correspond to systems in 
contact where both stars fill and possibly exceed their Roche 
lobes. If they exceed this limit, the size of the common en- 
velope depends on the most external contact surface. These 
systems are characterized by short periods, small orbital separations,
and a wide range of mass ratios. Except for RZ Pyx,
with spectral type B, the systems are composed
of late-type stars with total masses below 3 M⊙. It is not yet
clear whether these systems are formed as contact binaries or
if they evolve from detached systems through loss of angular
momentum. Most likely, the population of contact binaries
is a mixture of both evolutionary paths.

To finish this section we would like to point out that, un- 
fortunately, our scheme is not without degeneracies or cross- 
class contamination. The main sources of contamination arise 
from pre-main sequence detached systems. In these systems, 
one of the components is in the contracting phase towards the 
Main Sequence while the second component has already sta- 
bilized in it. The former is far dimmer than the latter and can 
thus be a source of confusion with type 2 systems despite their 
detached geometry. Due to the relative youth of these pre-main 
sequence systems, they generally have not had enough time to
circularise their orbits and therefore show some degree of eccentricity.
This criterion can be used to place them correctly in
the type 1 group. Nevertheless, certain orientations of the or- 
bit with respect to the observer may result in eclipses being in 
quadrature despite the eccentricity of the system. 

3. Results and discussion

In order to assess the performance of the ensemble of neural 
networks thus generated, we divided the whole set of examples 
into two groups: (i) a training set used to obtain the a poste- 
riori probabilities of the parameter sets generated by MCMC 
methods (75% of the complete set), and (ii) a test set used to ob- 
tain estimates of the expected cross-class misclassification rates 
(25% of the complete set). In order to approximately maintain 
the relative size of each class in the complete set, a light curve 
is assigned to the training set with a 0.75 probability or to the 
test set with a 0.25 probability. This splitting is performed 10 
times and the resulting blocks considered separately. Errors in 
the performance estimates correspond to the root sum square of 
the performance of the ten partitions. Furthermore, three differ- 
ent network architectures are tested. Invariably, all three archi- 
tectures have a 50-unit input layer and a 7-unit output layer. 
They differ in the presence/absence of one or several hidden
layers. The first network is a logistic regression network (with 
no hidden layer); the second network has one hidden layer with 
30 units; and the third network architecture contains two hid- 
den layers of 20 and 10 units, respectively. Each splitting of 
the complete set is used to generate 1000 networks of each ar- 
chitecture, and the last 200 are used to predict classes for the 
light curves in the corresponding test sets. Thus, we end up with 
10×3 ensembles of 200 neural networks. In addition to this, the
effectiveness of the ARD procedure was assessed by comparing
the predictions on the test sets of networks of the same architecture
with/without ARD implemented in the training process.
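
The random partition described above can be sketched as follows. Each light curve is assigned independently to the training set with probability 0.75, so class proportions are preserved only on average, and the procedure is repeated 10 times. The function names, the seeds, and the set size of 1000 are hypothetical choices made for this illustration.

```python
import random


def random_split(n_examples, train_probability=0.75, seed=None):
    """Assign each example index independently to the training set with
    probability train_probability, otherwise to the test set."""
    rng = random.Random(seed)
    train, test = [], []
    for index in range(n_examples):
        if rng.random() < train_probability:
            train.append(index)
        else:
            test.append(index)
    return train, test


def repeated_splits(n_examples, n_splits=10, train_probability=0.75, seed=0):
    """Generate n_splits independent random partitions of the same set."""
    return [
        random_split(n_examples, train_probability, seed + k)
        for k in range(n_splits)
    ]


# Hypothetical set of 1000 light curves, identified by their index.
splits = repeated_splits(n_examples=1000)
for train, test in splits[:3]:
    print(len(train), len(test))
```

Because the assignment is made per object, the training and test set sizes fluctuate around 75% and 25% from one partition to the next.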

In order to avoid unnecessary computations, we checked
the average error percentage for each architecture and found
8.7% ± 3.6 for the 50-7 architecture, 6.9% ± 1.3 for 50-30-7,
and 6.9% ± 1.3 for 50-20-10-7. The average log probability of
the test cases was -0.24 for the 50-7 network, -0.17 for the
50-30-7 network, and -0.18 for the 50-20-10-7 network. Although
ARD could naturally prune unnecessary units and connections
if hyperparameters were introduced in all network layers, we
preferred to continue the analysis with the 50-30-7 architecture.
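
The two figures of merit quoted above (error percentage and average log probability of the test cases) can be computed as in the sketch below. It assumes that the predictions of an ensemble are combined by averaging the class probabilities of its member networks, which is the usual practice for networks sampled by MCMC but is an assumption here. The function names and the toy numbers are hypothetical.

```python
import math


def ensemble_predict(network_outputs):
    """Average class probabilities over the networks of an ensemble.

    network_outputs[k][i][c] is the probability that network k assigns to
    class c for test case i.
    """
    n_networks = len(network_outputs)
    n_cases = len(network_outputs[0])
    n_classes = len(network_outputs[0][0])
    averaged = []
    for i in range(n_cases):
        averaged.append([
            sum(network_outputs[k][i][c] for k in range(n_networks)) / n_networks
            for c in range(n_classes)
        ])
    return averaged


def evaluate(probabilities, true_classes):
    """Return (error percentage, average log probability of the true class).

    Assumes strictly positive probabilities, as produced by a softmax.
    """
    errors = 0
    log_prob = 0.0
    for probs, truth in zip(probabilities, true_classes):
        predicted = max(range(len(probs)), key=probs.__getitem__)
        errors += predicted != truth
        log_prob += math.log(probs[truth])
    n_cases = len(true_classes)
    return 100.0 * errors / n_cases, log_prob / n_cases


# Toy example: 2 networks, 2 test cases, 3 classes (illustrative numbers only).
outputs = [
    [[0.7, 0.2, 0.1], [0.2, 0.5, 0.3]],
    [[0.5, 0.4, 0.1], [0.4, 0.3, 0.3]],
]
averaged = ensemble_predict(outputs)
error_pct, mean_log_prob = evaluate(averaged, true_classes=[0, 2])
print(error_pct, mean_log_prob)
```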

The performance of the neural network does not depend on 
the total number of measurements in the light curve. It would 
indeed depend on the total information content of the avail- 
able points (note that the total information content combines 
information not only on the phase coverage but also on the 
relevance of the covered phases for classification purposes), 
if no pattern completion were carried out during the prepro- 
cessing stage. This can be seen by taking the extreme case of 
an infinite number of points concentrated on a very narrow 
phase interval where all classes present the same behaviour. 
But, as explained above, the preprocessing stage completes the 
missing bins using the curvature of the closest light curves in 
the SOM. Therefore, if the completion process is correct and 



the initial incomplete light curve has enough information to
reconstruct the missing phases, no dependence of the neural
network's performance on the information content of the light
curve before preprocessing should be detected. This is in fact
the case down to the minimum information content found in
Hipparcos light curves, around 20% (100% being a complete light
curve). Unfortunately, this result is not conclusive, since only
10% of the catalogue has information content below 60%,
and therefore the statistics are rather poor. The study was carried
out grouping the light curves in bins of information content
of width 20%. The smaller number of cases in each bin increases
the standard deviation up to 4.3%.

We also investigated the dependence of the classifier per- 
formance on the ratio between the amplitude and the errors in 
the measurements of the light curve (the signal-to-noise ratio) 
and found no significant trend above a mean variance of 5.7%. 
Again, the preprocessing stage tries to minimize the effect of 
the errors in the measurement by means of the regression process.
It has to be borne in mind that it is actually a smooth
curve (the result of the second regression) that is used as input 
to the neural net. 

The robust performance of the neural network described 
in the preceding paragraphs can also be expected for light 
curves in the same information content and signal-to-noise ratio
ranges as those found in the Hipparcos catalogue. As mentioned
above, this implies light curves with information contents
above 60% (although the classifier shows the same performance
down to 20%, with only a few tens of light curves to
compute the means). Light curves with lower information contents
can possibly be mistakenly completed if not enough information
is available for a reliable completion. Regarding the
signal-to-noise ratios, we have found that 98.5% of the light
curves in the catalogue have ratios above 5σ.

Finally, we studied the performance of the neural network 
as a function of the number of bins used as input and found 
that 50 bins lies in a plateau with similar performances that 
goes from 40 bins up to 90 bins. Below 40 bins the degrada- 
tion is first due to the misclassification of eccentric systems 
and below 20 bins mainly to confusion between types 1 and 
3. Above 90 bins, there are not enough examples to construct 
the relationship between each input node and the class and the 
performance curve begins a slow decline as expected. 
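
The binned input discussed above can be illustrated with a minimal sketch that folds a light curve with its period, averages it in 50 equal-width phase bins, and shifts the deepest bin to phase 0.75. It is deliberately simpler than the preprocessing used in this work: plain averaging replaces the regression, empty bins are left as None instead of being completed with the SOM, and the linear rescaling that sets the deepest bin to 1.0 is an assumption. The function name and the synthetic light curve are hypothetical.

```python
import math


def fold_and_bin(times, mags, period, n_bins=50, primary_phase=0.75):
    """Fold a light curve, average it in n_bins phase bins, and rotate it
    so that the faintest bin (deepest eclipse) sits at primary_phase.

    Values are rescaled so that the brightest bin is 0.0 and the deepest
    bin is 1.0. Empty bins are returned as None.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for t, m in zip(times, mags):
        phase = (t / period) % 1.0
        b = min(int(phase * n_bins), n_bins - 1)  # guard against phase == 1.0
        sums[b] += m
        counts[b] += 1
    means = [s / c if c else None for s, c in zip(sums, counts)]
    filled = [m for m in means if m is not None]
    if not filled:
        raise ValueError("no measurements")
    faintest, brightest = max(filled), min(filled)  # larger magnitude = fainter
    deepest = means.index(faintest)
    target = int(primary_phase * n_bins)
    span = faintest - brightest
    curve = [None] * n_bins
    for i, m in enumerate(means):
        if m is not None:
            value = (m - brightest) / span if span > 0 else 0.0
            curve[(i - deepest + target) % n_bins] = value
    return curve


# Synthetic light curve (illustrative only): period 2 d, one eclipse at phase 0.3.
times = [0.01 * k for k in range(1000)]
mags = [8.0 + 0.5 * math.exp(-((((t / 2.0) % 1.0) - 0.3) / 0.05) ** 2) for t in times]
curve = fold_and_bin(times, mags, period=2.0)
print(curve[37])
```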

Table 1 shows the average cross-class misclassification percentage
and the standard deviation computed for the 10 different
splittings of the complete set for the 50-30-7 architecture.
Each row lists the percentage of objects of a given type that
have been misclassified into each of the other categories.
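
A table of this kind can be assembled as in the following sketch: for each splitting, the percentage of objects of every true class assigned to every other class is computed, and each cell is then summarised by its mean and standard deviation over the splittings. The use of the population standard deviation, the function names, and the toy labels are assumptions made for this illustration.

```python
import statistics


def misclassification_percentages(true_classes, predicted_classes, labels):
    """Percentage of objects of each true class assigned to each other class."""
    table = {t: {p: 0.0 for p in labels if p != t} for t in labels}
    totals = {t: 0 for t in labels}
    for t, p in zip(true_classes, predicted_classes):
        totals[t] += 1
        if p != t:
            table[t][p] += 1
    for t in labels:
        for p in table[t]:
            table[t][p] = 100.0 * table[t][p] / totals[t] if totals[t] else 0.0
    return table


def summarise(tables, labels):
    """Mean and (population) standard deviation of each off-diagonal cell."""
    summary = {}
    for t in labels:
        for p in labels:
            if p == t:
                continue
            values = [table[t][p] for table in tables]
            summary[(t, p)] = (statistics.mean(values), statistics.pstdev(values))
    return summary


# Toy example with two classes and two splittings (illustrative only).
labels = ["1", "2"]
truth = ["1", "1", "1", "1", "2", "2"]
tables = [
    misclassification_percentages(truth, ["1", "1", "1", "2", "2", "1"], labels),
    misclassification_percentages(truth, ["1", "1", "2", "2", "2", "2"], labels),
]
summary = summarise(tables, labels)
print(summary)
```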

These percentages are less than 1 point lower on average
than those obtained without ARD implemented during the training.
From this point onward, and having an estimate of the ex- 
pected misclassification rate, we continue the analysis of the 
performance of our classification system with the complete set 
of Hipparcos plus synthetic eccentric light curves as training 
set of a 50-30-7 architecture network with ARD implemented. 
Although this particular choice is only marginally justified in 
terms of classification performance, we consider that it provides
the best results with maximum information. Figure 6 shows
the mean values of the third level hyperparameter controlling

Table 1. Cross-class misclassification percentages. Each row lists the percentage of light curves of a given class that were
mistakenly classified as belonging to the corresponding type in the row of headers.

Type       Eccentric  Type 1     Type 2     Type 3     Type 4     Type A     Type B
Eccentric  -          3.6 ± 0.6  0.5 ± 0.3  0.3 ± 0.3  0.0 ± 0.0  0.0 ± 0.0  0.0 ± 0.0
Type 1     1.0 ± 0.3  -          1.5 ± 0.4  4.9 ± 0.7  1.6 ± 0.3  0.0 ± 0.0  0.0 ± 0.0
Type 2     0.2 ± 0.2  4.0 ± 0.9  -          0.0 ± 0.0  0.0 ± 0.0  0.0 ± 0.0  0.0 ± 0.0
Type 3     0.0 ± 0.0  8.8 ± 1.7  0.0 ± 0.0  -          5.3 ± 1.1  0.0 ± 0.0  0.0 ± 0.0
Type 4     0.0 ± 0.0  2.1 ± 0.7  0.0 ± 0.0  7.4 ± 1.0  -          0.0 ± 0.0  0.0 ± 0.0
Type A     0.0 ± 0.0  0.0 ± 0.0  0.0 ± 0.0  0.0 ± 0.0  0.0 ± 0.0  -          13.4 ± 1.7
Type B     0.0 ± 0.0  0.2 ± 0.1  0.0 ± 0.0  0.0 ± 0.0  0.0 ± 0.0  3.2 ± 0.4  -

Fig. 6. Mean values of the hyperparameter controlling the average
magnitude of the weights out of the input unit for the
Hipparcos set plus 112 synthetic light curves of eccentric systems.



the average strength of the connections out of each input unit, 
obtained in the last 1500 networks with ARD implemented. 

It shows non-negligible values at all phase bins, although the
average strength of neural connections from input units is seen
to display some degree of structure. On a background of low
magnitude weights, we find a higher concentration of sensitive
units around phase 0.25. This can be easily understood in terms
of the classification criteria set out in Sect. 2 and the preprocessing
of the light curves, the combination of which makes the
class assignment decision depend mainly on the properties of
that region. The relative importance of the connection strength
of synapses out of the unit representing phase φ = 0.75 can
be explained under the ARD framework as the result of the
MCMC methods blindly exploring the hyperparameter space
with a probability given only by the prior. This unit conveys
no information at all (the preprocessing stage fixes its input at
1.0) and therefore, we can expect its posterior probability to
be roughly equal for all possible values of this hyperparameter.
The final value is simply an average of this blind exploration of
the prior. The low sensitivity of input units away from φ = 0.25
can be explained by the easy separability of very eccentric systems
with respect to all other classes.



4. Conclusions 

In this work we present an automatic light curve classifier based 
on neural networks able to separate pulsating stars from eclips- 
ing binary systems. We classified the latter into 4 groups, ac- 
cording to a new classification scheme based solely on the mor- 
phological features of the light curve, which maps the system 
geometrical configuration. We applied the new classification 
scheme to a sample of 81 systems with well-measured light 
curves and well-determined physical parameters, to investigate 
the physical properties of the classes thus defined. We found 
that, based only on the light curve morphology, we are able to 
separate systems with different geometrical configurations. 

From a technical point of view, the improvement of our
classification scheme with respect to the traditional one relies
mainly on establishing well-defined and objective criteria that can
be easily implemented on a neural network. The traditional
classification was not systematically formulated and was subjectively
applied after visual inspection of the light curve by
the observer. From a physical point of view, our classification
scheme improves the traditional classification by establishing
classes characterized by the variation of the system geometry
from one group to the other. In the traditional scheme, systems with
different geometrical configurations were classified in the same
group.

We also considered under what circumstances the classifi- 
cation scheme proposed here would fail a priori to reflect the 
underlying geometrical configuration and found the following 
two exceptions: 

- Pre-main-sequence systems with low luminosity secondaries
and semimajor axis aligned with the line of sight.

- Semi-detached systems with a secondary component in
contact with the Roche lobe and a primary close to contact,
both stars being of the same or similar luminosities.

We explored the classifier performance when trained with 
Hipparcos examples alone and together with a set of light 
curves artificially generated to increase the relative frequency 
of eccentric systems in the training set. In the latter case we 
found a significant improvement in the classifier's ability to de- 
tect eccentric binary systems at the expense of a small degrada- 
tion in the overall performance. We explored several architec- 
tures for the network and found improved performance for net- 
works with one or more hidden layers (with negligible differ- 
ences between them). Finally, we also found negligible differ- 
ences between the performance of classifiers trained with and 
without ARD. Almost the entire set of misclassifications occurs
at the boundaries between classes, mainly due to the
nonseparability of the sets of examples. We attribute this to the presence
of noise in the training set. Nevertheless the softmax formula- 
tion of the model provides a quantitative measurement of the 
confidence in the class assignment in such cases, because sys- 
tems in the proximity of a boundary between two classes ex- 
hibit comparable values in the output of the neurons that label 
those classes. 

We have compared the performance of Bayesian neural
networks presented above with that of a simple multilayer
perceptron and found an overall improvement of 12.1%; i.e., the
percentage of wrong classifications of a 50-30-7 multilayer
perceptron is 19.0%. These figures combine both the improvement
in the regression stage and that in the final classification. The 
inclusion of wide priors for the hyperparameters leads to in- 
creased robustness when outliers are expected. In our case, we 
have found that the results of regression with simple
multilayer perceptrons, in the presence of outliers in the light curve,
are significantly worse than those of Bayesian neural networks. 

Fig. 1. Examples of the classes defined in the text obtained by Hipparcos, folded with the periods provided in the mission
catalogue. 

5. Neural classifier

As mentioned in the introduction, the final aim of this work is to 
make the computational classification of automatically prepro- 
cessed light curves possible without human supervision. The 
classification system defined in the previous section was de- 
signed to accomplish this goal, while at the same time preserv- 
ing the physical significance investigated there. In this section, 
we describe the methodology used to implement the classifier 
and the results obtained, as assessed using standard techniques
in the field of connectionism. 

5.1. Bayesian training of neural networks 

Most connectionist methods consist of distributing the com- 
putation of the solution of a given task amongst a number of 
interconnected, formally equivalent units or neurons perform- 
ing very simple nonlinear operations upon the weighted sum of 
their inputs. The connection topology divides the ensemble of 
neurons (the neural network) in layers with forward
connectivity. This architecture is commonly known as a multilayer
perceptron. Although there are several other architectures and
different local operations from the one sketched above, the
multilayer perceptron is by far the most widely used for
classification tasks.

The most popular way to adjust the free parameters (the 
strength or weight of the synapses), in order to teach the neural 
network to accomplish the desired task, is the error
backpropagation algorithm by Rumelhart et al. (1986), which consists of
exploring the error hypersurface by following the reversed lo- 
cal error gradient. By presenting the network with a series of 
examples for which known desired outputs are available (the 
training set), the local gradient of the total error with respect to 
the connection weights can be computed and the weights cor- 
respondingly updated. There are several techniques to achieve 
generalization, understood as the ability of a network to imprint 
in its weights the abstract rules for classification implicit in the 
training examples, disregarding at the same time the particular 
details of the examples used. Again, the most common prac- 
tice consists of dividing the available set of examples into three 
groups: a training set, a validation set and a test set. Learning 
proceeds by minimizing the training set error while at the same 
time monitoring the validation error. When the network has 
captured the general rules for classification and started to in- 
corporate the particular details of the training set, the validation 
error reaches a minimum while the training error continues de- 
creasing. It is at this minimum point that learning is stopped, 
in order to avoid overtraining, and the error of the network is 
estimated using the test set. There are multiple variations to
this very basic scheme, but most of them end up in the vicinity 
of a local minimum of the error hypersurface which we expect 
to be the global minimum. 
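
The early-stopping rule described above can be illustrated with a minimal Python sketch. It is not the procedure used in this work (which relies on Bayesian training instead); the function name, the `patience` parameter, and the validation error values are assumptions introduced only for illustration.

```python
def early_stopping_epoch(val_errors, patience=5):
    """Return the epoch whose weights would be kept under early stopping.

    val_errors: validation error measured after each training epoch.
    patience:   hypothetical tolerance, i.e. how many epochs without
                improvement are allowed before training is stopped.
    """
    best_epoch, best_err = 0, float("inf")
    for epoch, err in enumerate(val_errors):
        if err < best_err:
            best_epoch, best_err = epoch, err
        elif epoch - best_epoch >= patience:
            break  # validation error stopped improving: overtraining begins
    return best_epoch


# Made-up validation curve: it falls, reaches a minimum, then rises again.
val_curve = [0.90, 0.60, 0.45, 0.40, 0.42, 0.47, 0.55, 0.60, 0.70, 0.80]
stop_at = early_stopping_epoch(val_curve, patience=3)
```

The error quoted for the network would then be measured on the separate test set, using the weights saved at epoch `stop_at`.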

Here we deviate from the common practice and use a dif- 
ferent formalism, which we consider more flexible and sound: 
Bayesian training of neural networks. In the Bayesian
framework, instead of a class assignment we obtain a predictive
probability distribution. Let $\theta$ denote the set of parameters
needed to fully specify a neural network architecture (i.e., all
the connection weights between neurons in the network). The
network class prediction $C_{n+1}$ for a new test case $x_{n+1}$ given a
training set $S_{\rm train}$ is computed as

$$P(C_{n+1} \mid x_{n+1}, S_{\rm train}) = \int P(C_{n+1} \mid x_{n+1}, \theta) \cdot P(\theta \mid S_{\rm train}) \, d\theta, \tag{1}$$

that is, an average of the predictions $P(C_{n+1} \mid x_{n+1}, \theta)$ made by
networks covering the whole parameter space, weighted by
the posterior probability of $\theta$ given the training set. The
expression $P(C_{n+1} \mid x_{n+1}, S_{\rm train})$ is a probability distribution for all
possible classes to which $x_{n+1}$ can be assigned, or, equivalently,
for all possible values of $C_{n+1}$. The a posteriori probability can
be computed by applying Bayes' theorem

$$P(\theta \mid S_{\rm train}) \propto P(S_{\rm train} \mid \theta) \cdot P(\theta), \tag{2}$$

that is, as the product of the likelihood function and the prior 
probability of the network parameters. Once this probability 
distribution is obtained, single value predictions can be ob- 
tained by minimization of loss functions, such as squared er- 
ror loss (equivalent to guessing the mean) or absolute error 
loss (equivalent to guessing the median) or, as in our case,
0-1 loss functions more suitable for classification tasks
(equivalent to guessing the mode). The integral in Eq. (1) is
defined over all parameter space. In the case of neural networks,
this integral is unmanageable without the aid of special
techniques developed for solving similar problems in the context
of theoretical physics. In this work we make use of hybrid
Monte Carlo techniques implemented in the software
package Flexible Bayesian Methods by Neal (1996). These are
used to approximate the integral by a sum of terms of the
form $P(C_{n+1} \mid x_{n+1}, \theta^{(i)})$, where all the sets of weights $\{\theta^{(i)}\}$ follow
the probability distribution $P(\theta \mid S_{\rm train})$. The likelihood function
$P(S_{\rm train} \mid \theta)$ can be a Gaussian function for regression networks,
which incorporate noise in the width of the Gaussian, or a
softmax model for classification purposes. A full description of
Markov Chain Monte Carlo (MCMC) techniques is clearly
beyond the scope of this paper. We refer the interested reader
to classical expositions of the method, such as Neal (1996)
or Bishop (1995). We mention here that the method achieves
equilibrium in the target statistical distribution only after a
certain number of networks $N_{\rm eq}$ have been generated. Therefore,
in general, only networks created after $N_{\rm eq}$ are used to estimate
the integral.
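
The Monte Carlo approximation of Eq. (1) can be sketched in a few lines of Python. The sketch assumes that the class probabilities predicted by each sampled network are already available as a list; the function names and the numerical values are hypothetical and serve only to illustrate the averaging over post-equilibrium networks and the choice of the mode under 0-1 loss.

```python
def predictive_distribution(sample_probs, n_eq):
    """Average per-network class probabilities, discarding the first n_eq.

    sample_probs: one list of class probabilities per sampled network,
                  in the order in which the networks were generated.
    n_eq:         number of initial (pre-equilibrium) networks to discard.
    """
    kept = sample_probs[n_eq:]
    if not kept:
        raise ValueError("no networks left after discarding n_eq samples")
    n_classes = len(kept[0])
    return [sum(p[k] for p in kept) / len(kept) for k in range(n_classes)]


def predict_class(mean_probs):
    """Single-valued prediction under 0-1 loss: the mode of the distribution."""
    return max(range(len(mean_probs)), key=lambda k: mean_probs[k])


# Made-up outputs of four sampled networks for a three-class problem.
samples = [
    [0.90, 0.05, 0.05],  # generated before equilibrium, discarded
    [0.20, 0.70, 0.10],
    [0.30, 0.60, 0.10],
    [0.10, 0.50, 0.40],
]
mean_probs = predictive_distribution(samples, n_eq=1)
label = predict_class(mean_probs)
```

The averaged vector `mean_probs` plays the role of the predictive distribution, and `label` is the index of the most probable class.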

This approach presents several advantages over traditional 
maximum likelihood methods such as error backpropagation. 
The main advantage stems from the fact that predictions are not 
formulated in terms of a unique network but as an average over 
all networks. Those networks with larger a posteriori
probabilities contribute more to the average than the rest, implying that
it is no longer necessary to limit the model complexity.

As mentioned above, in classical backpropagation training, 
model complexity is usually limited in order to avoid over- 
training because complex models can incorporate increasingly 
complex features of the input space, including random noise in 
the training set. For each training set size and statistical dis- 
tribution of patterns there is an optimal model complexity that 
is usually sought by cross-validating the training performance 
with an independent set of examples called a validation set. By
stopping the learning algorithm at the minimum of the
validation error curve, we are effectively limiting the average norm
of the weight vectors, thus limiting model complexity (see e.g.
Bishop 1995). This is at the expense of reducing the
available set of training examples to create the validation set. In the
Bayesian framework, on the contrary, if the model and prior 
probabilities (henceforth priors) are appropriate, the inferences 
are correct regardless of the training set size, thus eliminating
the need for cross validation and for the reduction of the train- 
ing set implied by it. 

One of the main advantages of Bayesian training of neural
networks is the possibility of including hierarchical
priors in Eq. (2). It allows automatic relevance determination
(ARD) of the parameters by introducing correlations amongst 
groups of parameters, in particular, amongst the set of weights 
connecting a given input unit with neurons in the next layer. 
A prior specification for the network parameters can be ex- 
pressed as the product of several independent fully specified 
probabilities (one for each connection weight at the lowest 
level) or as the integral of a more general probability distribu- 
tion that applies to the connection weights of a given unit and 
that is characterized by new sets of parameters. Because these 
newly introduced parameters directly dictate not the weight 
probability density but the probability distribution of the pa- 
rameters that describe it (i.e., that describe the weight probabil- 
ity density), they are called hyperparameters. In the first case, a 
Gaussian prior with fixed mean and width can be used directly 
for the probability distribution of a given connection weight. In 
the second approach, this probability distribution of weights in 
the network would be the result of averaging over all possible 
means and widths (hyperparameters) weighted by their respec- 
tive prior probabilities: 



$$P(\theta) = \int P(\theta \mid \gamma) \cdot P(\gamma) \, d\gamma, \tag{3}$$

where, in the example, $\gamma$ is the vector of hyperparametric
means and/or widths that is common for all synaptic weights
out of the neuron. $P(\gamma)$ in turn can be fully specified or else
given in terms of new hyperparameters at the next level of neu- 
ral units in the same layer. By using different levels of hyperpa- 
rameters from the bottom levels of single connection weights 
or single unit weight sets up to the highest level of layers or 
the entire network, correlations amongst parameter sets of the 
same group can be introduced in the integral of Eq. (3).

This scheme, when applied to the input-hidden connection 
weights, can be used to test the relevance of input variables for 
the classification task. If a given input variable is not relevant 
for classification purposes, under very special circumstances it 
may worsen the network performance. By making use of these 
hierarchical priors, we can effectively remove noninformative 
input units simply because a large fraction of the parameter 
space with significant contributions to the integral on the right 
hand side of Eq. (1) will come from networks with their
connection weights set to zero.
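
A two-level prior of this kind can be illustrated with the following Python sketch. It assumes, purely for illustration, that each input unit has a gamma-distributed precision (inverse variance) shared by all the weights leaving that unit; the exact parameterization used by the software of Neal (1996) may differ, and the function name and default values are hypothetical.

```python
import random


def sample_ard_weights(n_inputs, n_hidden, mean_precision=1.0, shape=0.5, seed=0):
    """Draw input-to-hidden weights from a two-level (ARD-style) prior.

    Each input unit gets its own precision hyperparameter; every weight
    leaving that unit is Gaussian with zero mean and that shared precision.
    """
    rng = random.Random(seed)
    weights, widths = [], []
    for _ in range(n_inputs):
        # random.gammavariate(shape, scale) has mean shape * scale.
        precision = rng.gammavariate(shape, mean_precision / shape)
        precision = max(precision, 1e-12)  # guard against numerical underflow
        sigma = precision ** -0.5
        widths.append(sigma)
        # All weights of this input unit share the same width sigma.
        weights.append([rng.gauss(0.0, sigma) for _ in range(n_hidden)])
    return weights, widths


# 50 inputs (phase bins) and 30 hidden units, as in a 50-30-7 network.
weights, widths = sample_ard_weights(n_inputs=50, n_hidden=30)
```

An input whose shared width is driven towards zero by the data has all its outgoing weights close to zero, which is the implicit pruning mechanism described above.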



5.2. Preprocessing of Hipparcos training patterns 

We have used light curves from the Hipparcos catalogue as 
training set. As usual with neural networks, the raw data (origi- 
nally in the JD-V magnitude space) need to be preprocessed to 
optimize the performance of the network. The preprocessing of 
light curves consists of several distinct stages briefly summa- 
rized here to serve as a guide for the following explanations. 

1. Unit conversion and binning

2. Pattern completion 

(a) Pattern regression 

(b) Normalization 

(c) SOM consultation 

(d) Second order interpolation 

3. Pattern regression 

4. Normalization and phase-shifting 

5.2.1. Unit conversion and binning 

First, original JD-V magnitude light curves from the Hipparcos 
archive are extracted and observations with bad quality flags 
removed. Then, the time coordinate is converted into phase ac- 
cording to Hipparcos ephemeris, if available. Otherwise, liter- 
ature ephemerides provided with the catalogue are used. The 
resulting light curve is binned into 50 phase intervals between
0 and 1, corresponding to $\Delta\phi = 0.02$. In a high fraction of the
catalogue, light curves contain gaps in the phase coverage of 
the binary cycle. 
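
The folding and binning step can be sketched as follows. This is a minimal illustration under stated assumptions: the ephemeris is given as a period and a reference epoch, bins are averaged with equal weights, and empty bins are marked with `None` to represent gaps in the phase coverage. The function name and the sample values are hypothetical.

```python
def fold_and_bin(times, mags, period, epoch, n_bins=50):
    """Fold observations into phase and average the magnitudes per phase bin.

    Returns a list of n_bins mean magnitudes; bins without observations
    are set to None (gaps in the phase coverage).
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for t, m in zip(times, mags):
        phase = ((t - epoch) / period) % 1.0
        # min() protects against phase values rounding up to exactly 1.0.
        idx = min(int(phase * n_bins), n_bins - 1)
        sums[idx] += m
        counts[idx] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


# Made-up observations of a system with a period of one day.
times = [0.0, 0.5, 1.0, 2.25]
mags = [8.0, 8.4, 8.2, 8.1]
curve = fold_and_bin(times, mags, period=1.0, epoch=0.0)
n_sampled = sum(1 for v in curve if v is not None)
```

With only four observations, 47 of the 50 bins remain empty, which is the situation that the pattern completion stage is meant to handle.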

The analysis in terms of the information content (IC) of the
inputs supports the choice of 50 bins as a compromise between
the maximum possible resolution and manageable input sizes. As a rule,
very narrow phase bins can preserve finer details of the light 
curve. In practice however, there is a limit above which fine 
details convey no useful information for the classification task. 
We found that the smallest detail necessary for the
classification task in our classification scheme was given by the
eccentricity definition (see Sect. 5.3). At a resolution of $\Delta\phi = 0.02$, a
system is classified as eccentric if the secondary eclipse is more
than two bins/input units away from phase $\phi = 0.25$.

Bearing that in mind, we studied the information content 
distribution along the light curve. We defined the information 
content of a given light curve as the sum of the mutual infor- 
mation content of the measured phase bins and the class. The 
mutual information $I$ between two random variables $X$ and $Y$
was defined as

$$I(X, Y) = \sum_{x, y} p(x, y) \log_2 \frac{p(x, y)}{p(x) \cdot p(y)}. \tag{4}$$

We have computed the mutual information between each of the
50 phase bins ($X_i$, $1 \le i \le 50$) and the class category ($Y$). The
resulting distribution is shown in Fig. 7. Equivalent plots at
higher resolutions (i.e., with smaller $\Delta\phi$) do not change this
smooth curve.
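
For discrete variables, the mutual information of Eq. (4) can be estimated from relative frequencies, as in the Python sketch below. It assumes that the magnitudes in each phase bin have already been discretized into a small number of levels; how that discretization is done is not specified here, and the function name is hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Estimate I(X, Y) in bits from two equally long discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        p_xy = c / n
        mi += p_xy * math.log2(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi


# A bin that determines the class carries one bit for two balanced classes;
# a bin that is independent of the class carries none.
informative = mutual_information([0, 0, 1, 1], [0, 0, 1, 1])
uninformative = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```

Applying the function to each of the 50 phase bins in turn, with the class label as the second argument, gives a curve of the kind discussed above.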

This plot shows that at a resolution of 50 bins, the mutual
information curve is smooth and intuitively reflects the
knowledge needed by a human classifier. At much higher
resolutions (which do not convey more information), there are not
enough examples to characterize the relationship between each
bin mean and the class, and the performance of the neural
networks degrades. At lower resolutions, important information
is lost and, again, the performance degrades, especially when
identifying eccentricity.

Fig. 7. Normalized mutual information between each of the
phase bins and the class.

5.2.2. Pattern completion 

Although neural networks are characterized by a high fault 
tolerance that extends to the presence of gaps in
the phase coverage, we observed improved performance of the 
classification network when these gaps are interpolated accord- 
ing to the procedure presented below, mainly in cases where 
incomplete phase coverage is worsened by the presence of
significant noise and/or outliers. Therefore, we chose to interpolate
data in the gaps by applying the following approach. The
incomplete light curve is presented to a Kohonen Self-Organizing
Map (SOM) constructed with the best and most representative
721 light curves of the Hipparcos catalogue, including pulsat- 
ing variables (see below). These were chosen to cover as many 
morphological features as possible with low noise levels and a 
phase coverage such that a simple spline interpolation allows 
reliable recovery of all morphological information. Following 
presentation of the incomplete light curve to the SOM, the map 
of neural activity is searched for the most similar light curve of
the map using the Euclidean distance criterion.

The SOM was created using the standard software package 
SOM_PAK prepared by the SOM programming team of the 
Helsinki University of Technology. The map had dimensions 
10 by 8; it was trained 100 times with different initializations,
and the map with the lowest quantization error was saved for
subsequent use. During training, Gaussian neighbourhood functions
were used. The automatic procedure implemented in SOM_PAK for
the search of such minimum error nets implies a random
initialization and two training stages: during the first 1000
cycles, unit vectors are ordered in a process whereby the
neighbourhood radius decreases from values close to the map size
down to unity, and the learning rate decreases from 0.05 to
zero. During the next 10000 cycles, unit vectors are fine-tuned
to minimize quantization error by training with smaller rates
(starting at 0.02) and neighbourhood radii (starting from 3.0).
The choice of the map dimensions is justified in terms of the
Sammon mapping of the input set (Sammon, 1969).
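
The two-stage training schedule can be summarized with the short Python sketch below. It assumes linear decay of both the learning rate and the neighbourhood radius, an initial radius equal to the larger map dimension, and end values of zero and unity for the fine-tuning stage; none of these details is stated explicitly in the text, so they should be read as illustrative assumptions, as should the function names.

```python
def linear_decay(start, end, step, n_steps):
    """Value of a parameter decaying linearly from start to end over n_steps."""
    frac = min(max(step / n_steps, 0.0), 1.0)
    return start + (end - start) * frac


def som_schedule(step, map_size=10.0):
    """Learning rate and neighbourhood radius at a given training step."""
    if step < 1000:
        # Ordering stage: large radius, rate decreasing from 0.05 to zero.
        rate = linear_decay(0.05, 0.0, step, 1000)
        radius = linear_decay(map_size, 1.0, step, 1000)
    else:
        # Fine-tuning stage: smaller rate (from 0.02) and radius (from 3.0).
        s = step - 1000
        rate = linear_decay(0.02, 0.0, s, 10000)
        radius = linear_decay(3.0, 1.0, s, 10000)
    return rate, radius


start_of_ordering = som_schedule(0)
start_of_tuning = som_schedule(1000)
```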

Presentation of an incomplete light curve to the SOM
requires adequate preprocessing. In this case, the preprocessing
consists of normalizing the light curve considered as a 50- 
component vector to unit length, as done with the map creation 
vectors. The reason for this is that a SOM operates by comput- 
ing the scalar product of the input vector and each of the neural 
codebook vectors, thus constituting a morphological similarity 
detector. Therefore, it is necessary to scale the input's incom- 
plete light curve vector to a length at least close to the one 
used for the codebook vectors. Unfortunately, the normaliza- 
tion constant of an incomplete light curve will be smaller than 
if it were complete, by a factor that depends on the gap total 
length and the precise phases missing from the curve. Thus, 
in order to properly normalize the incomplete light curve, we 
need the same phase bins that we want to retrieve from the 
SOM. To overcome this difficulty, the original data prior
to the phase-binning process are regressed using a set of
neural networks obtained under the Bayesian framework described
above. 

The regression network is in fact a set of networks, the
parameters of which follow the distribution function $P(\theta \mid S_{\rm train})$,
with $S_{\rm train}$ the set of points in the light curve. This set of
networks is generated by specifying priors with hyperparameters
for input-to-hidden weights, hidden biases, hidden-to-output 
weights, and output biases. 

The prior specification used for the output bias is a 
Gaussian prior with a mean of zero and standard deviation 
10. For input-to-hidden weights and hidden biases, a Gaussian 
distribution is used with zero mean and variance given by a 
gamma distribution of mean equal to 2.0 and $\alpha = 0.5$, where
$\alpha$ is the shape parameter. Finally, the hidden-to-output
weights are given Gaussian priors with mean equal to 3.0 and
$\alpha = 0.5$. These weights are automatically rescaled based on the
number of hidden units so that the effect is independent of the
hidden layer size in the limit of large numbers. Again, we
refer those readers interested in the details of this method to the
classical exposition by Neal (1996).

This regression network is then used to interpolate the miss- 
ing gaps and the result is used as the basis for computing the 
normalization constant. It is important to notice that the regres- 
sion is only used for normalization purposes. The query to the 
SOM is made with the incomplete light curve, previously nor- 
malized to unity with the length derived from the light curve 
completed by regression. 
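
The normalization and the query to the map can be sketched as follows. The sketch assumes that missing bins are marked with `None` and that they are simply skipped when computing the Euclidean distance to each codebook vector; whether the software used in this work treats missing components in exactly this way is an assumption, and the function names and toy vectors are hypothetical.

```python
import math


def normalize_with_reference(incomplete, completed):
    """Scale an incomplete curve by the length of its regression-completed version."""
    norm = math.sqrt(sum(v * v for v in completed))
    return [None if v is None else v / norm for v in incomplete]


def best_matching_unit(query, codebook):
    """Index of the codebook vector closest to the query (Euclidean distance).

    Only the components that are present in the query contribute.
    """
    def dist2(unit):
        return sum((q - u) ** 2 for q, u in zip(query, unit) if q is not None)

    return min(range(len(codebook)), key=lambda i: dist2(codebook[i]))


# Toy three-bin example: the second bin of the input is missing.
completed = [3.0, 4.0, 0.0]    # curve completed by regression (length 5)
incomplete = [3.0, None, 0.0]  # observed bins only
codebook = [[0.0, 1.0, 0.0], [0.6, 0.8, 0.0], [1.0, 0.0, 0.0]]
query = normalize_with_reference(incomplete, completed)
bmu = best_matching_unit(query, codebook)
```

Note that the regression-completed curve is used only to obtain the length; the values compared with the codebook are those of the observed bins.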

Once the SOM has been consulted and the resulting neural 
activity map searched for the closest exemplars, these are used 
to fill in the gaps of the incomplete input. The process modifies 
the zero and first-order terms of the retrieved exemplar in the 
missing phase interval (i.e., adds a linear function) to match 
the limiting data, and it uses only higher order curvature terms. 
The correction applied to the values of the retrieved exemplars
in the phase gaps is given by

$$V_i = V_i^{\rm ex} \cdot \left(\alpha + m \cdot (\phi_i - \phi_{i1})\right), \tag{5}$$

where

$$m = (\beta - \alpha) / (\phi_{i2} - \phi_{i1}), \tag{6}$$

$$\alpha = V_{i1} / V_{i1}^{\rm ex}, \tag{7}$$

$$\beta = V_{i2} / V_{i2}^{\rm ex}, \tag{8}$$

and $i1$ is the subindex of the last sampled phase bin before the
gap, $i2$ is the subindex of the first sampled phase bin after the
gap, $V_{i1}$ and $V_{i2}$ are the values of the average magnitudes
measured in the corresponding phase bins of the incomplete curve,
and $V_{i1}^{\rm ex}$ and $V_{i2}^{\rm ex}$ the magnitudes in bins $i1$ and $i2$ of the
exemplar curve.
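
The gap-filling correction can be sketched in Python as follows. The sketch follows one reading of Eqs. (5)-(8): the exemplar is multiplied by a factor that varies linearly in phase between the observed-to-exemplar magnitude ratios at the two edges of the gap. It handles a single gap that does not wrap around phase 1, and the function name and numerical values are hypothetical.

```python
def fill_gap(curve, phases, exemplar, i1, i2):
    """Fill the bins strictly between i1 and i2 using a rescaled exemplar.

    curve:    incomplete light curve (None in the gap).
    phases:   phase of each bin.
    exemplar: complete curve retrieved from the map.
    i1, i2:   last sampled bin before the gap and first sampled bin after it.
    """
    alpha = curve[i1] / exemplar[i1]  # ratio at the left edge of the gap
    beta = curve[i2] / exemplar[i2]   # ratio at the right edge of the gap
    m = (beta - alpha) / (phases[i2] - phases[i1])
    filled = list(curve)
    for i in range(i1 + 1, i2):
        filled[i] = exemplar[i] * (alpha + m * (phases[i] - phases[i1]))
    return filled


# Toy four-bin example with a two-bin gap.
phases = [0.00, 0.02, 0.04, 0.06]
curve = [2.0, None, None, 6.0]
exemplar = [1.0, 1.0, 2.0, 2.0]
filled = fill_gap(curve, phases, exemplar, i1=0, i2=3)
```

By construction the filled curve joins the observed values continuously at both edges of the gap, while the shape inside the gap is inherited from the exemplar.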

5.2.3. Pattern regression 

The result of the SOM-based pattern completion (V magnitudes 
for phase bins not sampled) is added to original data obtained 
from the catalogue before binning in phase. The completed 
light curve is regressed again using a second set of neural net- 
works totally equivalent to the one used in the pattern comple- 
tion stage, and the result is used to obtain an equispaced light 
curve of, again, 50 bins. 

5.2.4. Normalization and phase-shifting 

Finally, the result is shifted in phase, to make the minimum of 
the light curve (maximum magnitude) coincide with $\phi = 0.75$,
and rescaled in magnitude to match the [0, 1] interval. This final 
product is used both in the training of the classification network 
and as input to the trained classifier. 
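
This last step can be sketched as follows. The sketch assumes a complete, equispaced curve, a cyclic shift by whole bins, and a linear rescaling in which the faintest bin maps to 1 and the brightest to 0; the orientation of the rescaling and the function name are assumptions made for illustration. A perfectly flat curve would need separate handling, since the rescaling divides by the magnitude range.

```python
def shift_and_rescale(curve, target_phase=0.75):
    """Shift the light minimum to target_phase and rescale magnitudes to [0, 1]."""
    n = len(curve)
    # Light minimum = largest magnitude (the first one in case of ties).
    i_min = max(range(n), key=lambda i: curve[i])
    target = int(target_phase * n) % n
    shift = target - i_min
    # Cyclic shift: bin i of the output takes the value of bin i - shift.
    shifted = [curve[(i - shift) % n] for i in range(n)]
    lo, hi = min(shifted), max(shifted)
    return [(v - lo) / (hi - lo) for v in shifted]


# Toy four-bin curve with its minimum of light in the second bin.
pattern = shift_and_rescale([10.0, 10.5, 10.0, 10.0])
```

For a 50-bin curve the target bin is the one covering phases 0.74 to 0.76.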

5.3. Classification 

A total of 1722 light curves were used for the training
of the network. The relative size of each group of curves is
given in Table 2. In it, we split the set of Type 1 light curves
to create a new group of systems (named Type 0) with
essentially the same detached configuration made explicit in Sect.
3 but with $\Delta\phi$ between maxima greater than 0.29 or less than
0.21, i.e., systems with clearly detectable eccentricities with the
phase bin width used in the preprocessing. Of the 1722, 1610
were directly taken from the Hipparcos catalogue, while the
remaining 112 are synthetic light curves of eccentric systems
covering all possible $\Delta\phi$ between eclipses in steps of one phase
bin, and depths of the secondary eclipse relative to the primary
of 100, 80, 60, and 40%. This addition to the basic Hipparcos 
training set was included to improve the poor performance of 
the classifier as an eccentricity detector when trained only with 
Hipparcos light curves, basically due to the scarcity of eccen- 
tric binary systems. Given that eccentric binaries only represent 
a small percentage of the total 1610 light curves, the overall 
performance of the network did not improve (it even degraded 
from a 6.1 % average misclassification rate to 6.9 %), but the
misclassification rate restricted to the eccentric systems
dropped from an average of 20% down to 4%. The advantage
introduced by the new class of eccentric systems is the possi- 
bility of using this classifier as a first stage in the automatic 
generation of lists of pre-main sequence binary system candi- 
dates in which the light curve information is combined with 
spectral or colour data. 

Besides splitting Type 1 systems, the table includes two 
new groups of light curves corresponding to pulsating variables 
light curves of two morphologies: sine-like curves with
symmetric ascending and descending slopes (type A) and
asymmetric light curves (type B). Figure 8 shows light curves from the
Hipparcos catalogue as examples of both new types of pulsat- 
ing morphologies. The reason for this noninformative classifi- 
cation is that morphological information alone is not enough to 
separate the different classes of pulsating stars. As mentioned 
above, this is the subject of ongoing research. The exact
statistics of the pulsating stars light curves used for training are given
in Table 3.

In this case, the output bias was given a Gaussian distri- 
bution of zero mean and variance given by a hyperparameter 
taken from a gamma distribution of mean 0.05 and shape pa- 
rameter 0.5. This was also the case for the (rescaled) hidden-to- 
output weights and the hidden biases. Input-to-hidden weights 
were given a higher level of hyperparameters: their values were 
taken from a Gaussian distribution of zero mean and a hyper- 
parameterized variance; this variance in turn was given for all 
such weights of a given input unit by a gamma distribution of 
shape parameter 0.5 and mean given by the overall gamma dis- 
tribution (common for all input units) of mean 0.2 and shape 
parameter 0.5. This hierarchical scheme introduces dependen- 
cies in the values of the weights connecting a given input unit 
to the hidden layer, thus allowing automatic (implicit) pruning 
of the input unit, if it does not add a significant improvement 
to the performance of the net. Finally as mentioned above, a 
softmax model is used in which the probability that the input 
light curve x is of class k is defined in terms of the output of 
the network as 



$$P(C = C_k \mid x) = \frac{\exp(f_k(x))}{\sum_{k'} \exp(f_{k'}(x))}, \tag{9}$$

with $f_{k'}(x)$ the output of neuron $k'$ that represents class $C_{k'}$.
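
The softmax model can be written in a few lines of Python. The sketch below subtracts the largest output before exponentiating, a standard numerical precaution that leaves the probabilities unchanged; the function name and the example outputs are hypothetical.

```python
import math


def softmax(outputs):
    """Convert the outputs of the last layer into class probabilities."""
    shift = max(outputs)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(o - shift) for o in outputs]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up outputs for a light curve close to the boundary between the
# first two classes: their probabilities come out comparable.
probs = softmax([2.0, 1.9, -3.0])
```

Comparable values in two entries of `probs` indicate a light curve close to the boundary between the corresponding classes, which gives a quantitative measure of the confidence in the assignment.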

The integral in Eq. (1) is approximated by the sum of 200
terms generated by the Markov Chain Monte Carlo method
after the distribution is allowed to stabilize during 800 initial steps.
Visual inspection of the evolution of the error and weights confirms that,
even before 600 initial steps, the method attains an equilibrium 
distribution. The errors reported in the next section have the 
same statistical properties if computed with the last 300 or 400 
networks generated. 

Fig. 8. Two examples of the new classes used to separate pulsating star light curves from binary systems: HIP 8163 (subclass A) and HIP 6029 (subclass B).

Table 2. Number of light curves of each class used in the training set. Type 0 includes eccentric systems from the Hipparcos
catalogue plus 112 synthetic light curves (see text).

Type 0     Type 1   Type 2   Type 3   Type 4   Type A   Type B
32(+112)   269      164      192      129      131      693



Table 3. Number of light curves of each class of pulsation used in the training set (pulsation class taken from the Hipparcos
catalogue).

          α Cygni   β Cephei   Cepheids   W Virginis   δ Cepheids   δ Scuti   Mira   RR Lyrae
Type A    5         26         2          2            20           51        19     6
Type B    11        20         17         23           226          36        182    178

Table 4. Relevant physical parameters of systems classified as class 1 in the text.

Name          P (d)    q        Comp.  Spec.   M       R       log Teff  log L   M_V     ref.
HD            V_max    a (R☉)          Type    (M☉)    (R☉)    (K)       (L☉)

BW Aqr        6.72     0.931    A      F7V     1.488   2.064   3.803     0.79    2.74    1,2
BD -16° 6074  10.33    21.298   B      F5V     1.386   1.788   3.810     0.70    2.98
V539 Ara      3.17     0.851    A      B3V     6.254   4.432   4.260     3.29    -1.70   1,3,4,5
HD 161783     5.71     20.53    B      B4V     5.326   3.734   4.230     3.02    -1.04
EM Car        3.41     0.936    A      O8V     22.89   9.34    4.531     5.02    4.56    1,6
HD 97484      8.38     33.72    B      O8V     21.43   8.33    4.531     4.92    4.31
GL Car        2.42     0.962    A      B8V     13.5    4.99    4.476     5.02    -2.97   7
HD 306168     8.38     22.60    B      B8V     13.0    4.74    4.468     4.92    -2.83
QX Car        4.48     0.915    A      B2V     9.267   4.289   4.377     3.72    -2.32   1,4,8
HD 86118      6.64     29.81    B      B2V     8.480   4.051   4.354     3.58    -2.07
SZ Cen        4.11     0.982    A      A7V     2.317   4.554   3.875     1.77    0.29    1,9,10
HD 120359     8.48     17.94    B      A7V     2.277   3.624   3.892     1.64    0.61
CW Cep        2.73     0.893    A      B0.5V   13.52   5.685   4.452     4.27    -3.17   1,11,12
HD 218066     7.59     24.217   B      B0.5V   12.08   5.177   4.442     4.15    -2.94
EK Cep        4.43     0.553    A      A1.5V   2.029   1.579   3.954     1.17    1.89    1,11,13,14,15
HD 206821     7.87     16.63    B      G5Vp    1.124   1.315   3.756     0.21    4.31
RZ Cha        2.83     0.994    A      F5V     1.518   2.264   3.810     0.90    2.46    1,10,16
HD 93486      8.10     12.17    B      F5V     1.509   2.264   3.810     0.90    2.46
V442 Cyg      2.39     0.901    A      F1V     1.564   2.072   3.839     0.94    2.35    1,17
HD 334426     9.72     10.81    B      F2V     1.410   1.662   3.833     0.72    2.89
V1143 Cyg     7.64     0.968    A      F5V     1.391   1.346   3.810     0.45    3.60    1,10,11
HD 185912     5.86     22.82    B      F5V     1.347   1.323   3.806     0.42    3.67
DI Her        10.55    0.874    A      B5V     5.185   2.680   4.230     2.73    -0.46   1,18
HD 175227     8.42     43.18    B      B5V     4.534   2.477   4.179     2.46    4.05
RX Her        1.78     0.847    A      B9      2.75    2.44    4.015     1.79    0.48    10,11,19
HD 170757     7.26     10.62    B      A0      2.33    1.96    3.985     1.48    1.12
AI Hya        8.29     0.922    A      F2m     2.145   3.914   3.826     1.44    1.10    1,20
BD +0°2259    9.36     27.630   B      F0V     1.978   3.850   3.851     1.24    1.61
TZ Men        8.57     0.604    A      A0V     2.487   2.016   4.017     1.63    0.93    1,21
HD 39780      6.19     27.94    B      A8V     1.504   1.432   3.857     0.69    2.97
UX Men        4.18     0.967    A      F5V     1.238   1.347   3.792     0.38    3.81    10,22,23
HD 37513      8.22     14.68    B      F8V     1.198   1.274   3.789     0.32    3.96
V451 Oph      2.20     0.848    A      B9V     2.776   2.640   4.033     1.93    0.34    1,10,11,24
HD 170470     7.87     12.27    B      A0V     2.356   2.028   3.991     1.53    1.11
V1031 Ori     3.41     0.936    A      A6V     2.473   4.321   3.895     1.80    0.18    1,25
HD 38735      6.02     33.727   B      A3V     2.286   2.977   3.924     1.60    0.74
AI Phe        24.59    0.966    A      K0IV    1.236   2.930   3.700     0.69    3.24    1,26
HD 6980       8.61     47.830   B      F7V     1.195   1.816   3.800     0.67    3.07
Zeta Phe      1.67     0.649    A      B6V     3.930   2.851   4.163     2.51    4.59    1,3,10,27
HD 6882       3.95     11.039   B      B8V     2.551   1.853   4.076     1.79    0.91
V1647 Sgr     3.28     0.900    A      A1V     2.189   1.831   3.982     1.41    1.35    1,28
HD 163708     6.94     14.93    B      A1V     1.972   1.666   3.959     1.23    1.73
V760 Sco      1.73     0.927    A      B4V     4.980   3.013   4.228     2.82    4.71    1,29
HD 147683     6.99     12.88    B      B4V     4.620   2.640   4.210     2.63    4.24
CV Vel        6.89     0.982    A      B2.5V   6.100   4.087   4.253     3.19    -1.48   1,30
HD 77464      6.69     34.96    B      B2.5V   5.996   3.948   4.250     3.15    -1.38





References used in Table 4. 1: Andersen (1991), 2: Clausen (1991), 3: Andersen (1983), 4: de Greve (1989), 5: Clausen (1996),
6: Andersen & Clausen (1989), 7: Giménez & Clausen (1986), 8: Andersen et al. (1983b), 9: Grønbech et al. (1977), 10: Popper (1980),
11: Fracastoro (1972), 12: Clausen & Giménez (1991), 13: Popper (1987), 14: Martín & Rebolo (1993), 15: Claret et al. (1995),
16: Jørgensen & Gyldenkerne (1975), 17: Lacy & Frueh (1987), 18: Popper (1982), 19: Jeffreys (1980), 20: Khaliullin & Kozyreva (1989),
21: Andersen et al. (1987), 22: Clausen & Grønbech (1976), 23: Andersen et al. (1989), 24: Clausen et al. (1986), 25: Andersen et al. (1990),
26: Andersen et al. (1988), 27: Clausen et al. (1976), 28: Clausen et al. (1977), 29: Andersen et al. (1985), 30: Clausen & Grønbech (1977).
Table 5. Relevant physical parameters of systems classified as class 2 in the text.





| Name / HD | P(d) / Vmax | q / a(R☉) | Comp. | Spec. Type | M (M☉) | R (R☉) | Teff (K) | log(L) (L☉) | Mv | ref. |
|---|---|---|---|---|---|---|---|---|---|---|
| RY Aqr | 1.9666 | 0.204 | A | A3 | 1.27 | 1.28 | 3.881 | 0.700 | 2.9 | 1,2 |
| HD 203069 | 9.06 | 7.61 | B | | 0.26 | 1.79 | 3.658 | 0.100 | 4.4 | |
| IM Aur | 1.2473 | 0.249 | A | B9 | 4.73 | 3.24 | 4.199 | 2.770 | -2.2 | 1,3,4 |
| HD 33853 | 7.70 | 8.81 | B | | 1.18 | 2.20 | 3.881 | 1.160 | 1.8 | |
| R CMa | 1.1359 | 0.131 | A | F2V | 1.52 | 1.73 | 3.849 | 0.76 | 2.77 | 3,5 |
| HD 57167 | 4.5730 | 5.48 | B | G8IV | 0.20 | 1.18 | 3.712 | -0.41 | 6.36 | |
| RZ Cas | 1.195 | 0.330 | A | A3V | 2.21 | 1.67 | 3.934 | 1.12 | | 3,6,7 |
| HD 17138 | 6.2 | 6.79 | B | | 0.73 | 1.94 | 3.672 | 0.16 | | |
| TV Cas | 1.8126 | 0.464 | A | B9V | 2.80 | 2.81 | 4.029 | 1.970 | -0.2 | 1,3,7,8 |
| HD 1486 | 10.57 | 10.01 | B | G5-9IV | 1.30 | 3.15 | 3.708 | 0.780 | 2.7 | |
| U CrB | 3.4522 | 0.300 | A | B6V | 4.70 | 2.60 | 4.185 | 2.520 | -1.6 | 1,3,7,9,10 |
| HD 136175 | 7.65 | 17.57 | B | G0III-IV | 1.41 | 4.91 | 3.764 | 1.390 | 1.2 | |
| SW Cyg | 4.5730 | 0.220 | A | A2V | 2.27 | 2.43 | 3.957 | 1.550 | 0.8 | 11 |
| HD 191240 | 9.30 | 16.28 | B | K0IV | 0.50 | 4.15 | 3.690 | 0.950 | 2.3 | |
| AF Gem | 1.2435 | 0.342 | A | B9.5V | 3.37 | 2.61 | 4.00 | 1.78 | | 7,12 |
| HD 210892 | 10.54 | 8.04 | B | G0III-IV | 1.155 | 2.32 | 3.768 | 0.75 | | |
| AQ Peg | 5.5485 | 0.256 | A | A2 | 2.34 | 2.64 | 3.959 | 1.630 | 0.6 | 1,11 |
| BD+12°4653 | 10.39 | 18.887 | B | | 0.60 | 4.89 | 3.644 | 0.910 | 2.4 | |
| AT Peg | 1.1461 | 0.472 | A | A4V | 2.22 | 1.86 | 3.924 | 1.19 | 1.76 | 1,3,7,13 |
| HD 210892 | 9.50 | 6.84 | B | | 1.05 | 2.15 | 3.690 | 0.38 | 4.1 | |
| AW Peg | 10.6225 | 0.160 | A | A1Ve | 2.06 | 1.90 | 3.959 | 1.350 | 1.3 | 1,3,11,14 |
| HD 207956 | 7.40 | 27.18 | B | F5IV | 0.33 | 6.12 | 3.602 | 0.930 | 2.4 | |
| DM Per | 2.7277 | 0.314 | A | B5V | 5.82 | 3.96 | 4.202 | 2.960 | -2.6 | 1,3,15,16 |
| HD 14871 | 7.88 | 16.18 | B | A5III | 1.83 | 4.60 | 3.914 | 1.940 | -0.1 | |
| RY Per | 6.8636 | 0.281 | A | B3V | 6.60 | 4.00 | 4.246 | 3.140 | -3.2 | 1,17,18 |
| HD 17034 | 8.48 | 30.96 | B | F6IV | 1.86 | 8.53 | 3.814 | 2.070 | -0.5 | |
| Bet Per | 2.8673 | 0.221 | A | B8V | 3.70 | 2.90 | 4.097 | 2.250 | -0.9 | 1,3,19,20 |
| HD 19356 | 2.12 | 14.04 | B | G8-K0III | 0.82 | 3.50 | 3.708 | 0.860 | 2.5 | |
| U Sge | 3.3806 | 0.333 | A | B8V | 5.70 | 4.05 | 4.146 | 2.750 | -2.2 | 1,9,18,21 |
| HD 181182 | 8.20 | 18.63 | B | G4III | 1.90 | 5.38 | 3.724 | 1.310 | 1.4 | |
| XZ Sgr | 3.2756 | 0.137 | A | A3V | 1.89 | 1.46 | 3.945 | 1.060 | 2.0 | 1,22 |
| HD 168710 | 0.00 | 11.98 | B | G5IV | 0.26 | 2.47 | 3.708 | 0.570 | 3.3 | |
| TX UMa | 3.2756 | 0.137 | A | B8V | 4.76 | 2.83 | 4.111 | 2.30 | | 1,3,9,23 |
| HD 93033 | 7.06 | 11.98 | B | G0III-IV | 1.18 | 4.24 | 3.740 | 1.17 | | |

Table 6. Relevant physical parameters of systems classified as class 3 in the text.



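| Name / HD | P(d) / Vmax | q / a(R☉) | Comp. | Spec. Type | M (M☉) | R (R☉) | Teff (K) | log(L) (L☉) | Mv | ref. |
|---|---|---|---|---|---|---|---|---|---|---|
| CX Aqr | 0.5559 | 0.537 | A | F5 | 1.19 | 1.29 | 3.806 | 2.70 | | 1,2 |
| | 10.70 | 3.48 | B | G9 | 0.64 | 1.15 | 3.696 | 0.72 | | |
| EE Aqr | 0.5089 | 0.322 | A | F0-F2 | 2.20 | 1.75 | 3.881 | 0.88 | | 1,3,4 |
| HD 213683 | 8.30 | 3.83 | B | | 0.71 | 1.07 | 3.643 | -0.42 | | |
| IU Aur | 1.8115 | 0.676 | A | B0V | 21.3 | 7.5 | 4.505 | 4.73 | -6.8 | 5,6,7 |
| HD 35652 | 8.19 | 20.58 | B | B0.5V | 14.4 | 7.2 | 4.449 | 4.46 | -6.3 | |
| TT Aur | 1.3327 | 0.648 | A | B2V | 8.58 | 4.06 | 4.373 | 3.664 | -4.5 | 8,9 |
| HD 33088 | 8.53 | 12.32 | B | | 5.56 | 4.17 | 4.267 | 3.264 | -3.5 | |
| DO Cas | 0.6847 | 0.313 | A | A | 1.69 | 2.10 | 3.96 | 1.42 | | 1,5,10,11 |
| | 8.60 | 4.26 | B | | 0.53 | 1.20 | 3.68 | -0.16 | | |
| YY Cet | 0.79 | 0.510 | A | A8 | 1.84 | 2.09 | 3.875 | 1.10 | | 1,12 |
| BD -18° 349 | 5.05 | 10.00 | B | | 0.94 | 1.63 | 3.725 | 0.30 | | |
| AI Cru | 1.4177 | 0.611 | A | B2IVe | 10.30 | 4.95 | 4.384 | 3.880 | -4.9 | 13 |
| -60.3723 | 9.20 | 13.54 | B | B4 | 6.30 | 4.43 | 4.248 | 3.240 | -3.3 | |
| V836 Cyg | 0.6534 | 0.333 | A | A3 | 2.4 | 1.96 | 4.00 | 1.04 | | 1,10,14 |
| HD 203470 | 8.59 | 4.67 | B | G | 0.80 | 1.24 | 3.76 | 4.32 | | |
| RZ Dra | 0.5508 | 0.442 | A | A5 | 1.40 | 1.62 | 3.911 | | | 1,15 |
| | 10.00 | 3.57 | B | K2 | 0.62 | 1.12 | 3.690 | | | |
| RU Eri | 0.6322 | 0.420 | A | F3V | 2.45 | 2.06 | | 1.07 | | 1,10 |
| HD 24658 | 9.90 | 4.69 | B | | 1.03 | 1.43 | | -0.03 | | |
| u Her | 2.0510 | 0.38 | A | B2IV | 7.60 | 5.80 | 4.301 | 3.680 | -4.5 | 5,6,16,17 |
| HD 156633 | 4.77 | 14.87 | B | B8III | 2.90 | 4.40 | 4.065 | 2.490 | -1.5 | |
| TT Her | 0.9121 | 0.435 | A | F2V | 1.56 | 2.30 | 3.960 | 1.13 | | 1,10,18,19 |
| BD +17°3117 | 5.17 | 9.70 | B | | 0.68 | 1.49 | 3.744 | -0.02 | | |
| RS Ind | 0.6240 | 0.310 | A | F1V | 2.00 | 2.00 | 3.857 | 0.98 | | 1,3,20 |
| | 9.90 | 4.23 | B | G8 | 0.62 | 1.18 | 3.668 | -0.23 | | |
| FT Lup | 0.470 | 0.426 | A | F2V | 1.43 | 1.43 | 3.826 | | | 1,21,22 |
| 132316 | 9.7 | 3.23 | B | K5-7V | 0.61 | 0.94 | 3.639 | | | |
| V Pup | 1.4550 | 0.522 | A | B1 | 14.86 | 6.18 | 4.450 | 4.340 | -6.1 | 5,6,23,24 |
| HD 65818 | 4.41 | 15.28 | B | B3 | 7.76 | 4.90 | 4.425 | 4.040 | -5.3 | |
| CX Vir | 0.7461 | 0.336 | A | F5 | 1.07 | 1.85 | | 0.75 | | 1,25 |
| 123660 | 9.20 | 3.90 | B | K | 0.36 | 1.12 | | -0.31 | | |
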
Table 7. Relevant physical parameters of systems classified as class 4 in the text.



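| Name / HD | P(d) / Vmax | q / a(R☉) | Comp. | Spec. Type | M (M☉) | R (R☉) | Teff (K) | log(L) (L☉) | ref. |
|---|---|---|---|---|---|---|---|---|---|
| OO Aql | 0.507 | 0.888 | A | G5V | 1.19 | 1.44 | 5700 | 1.97 | 1,2,3,4 |
| HD 187183 | 9.20 | 4.570 | B | | 1.34 | 1.00 | 5635 | 1.62 | |
| V535 Ara | 0.629 | 0.582 | A | A8V | 2.18 | 2.10 | 8750 | 3.17 | 1,5 |
| HD 159441 | 7.40 | 4.667 | B | | 1.27 | 0.79 | 8572 | 7.86 | |
| AO Cam | 0.329 | 0.766 | A | | 1.03 | 0.98 | 5520 | 0.80 | 3,6 |
| BD +52°826 | 9.50 | 2.452 | B | | 0.88 | 0.79 | 5826 | 0.80 | |
| V523 Cas | 0.233 | 0.569 | A | K4 | 0.79 | 0.75 | 4207 | 0.16 | 3,7,8 |
| | | 1.714 | B | | 0.58 | 0.45 | 4407 | 0.12 | |
| V677 Cen | 0.325 | 0.481 | A | | 1.06 | 1.19 | 5745 | 1.39 | 3,6 |
| | 11.55 | 2.312 | B | | 0.51 | 0.15 | 5841 | 0.27 | |
| V752 Cen | 0.370 | 0.575 | A | F8V | 1.20 | 1.24 | 6210 | 2.06 | 3,6 |
| HD 101799 | | 2.681 | B | | 0.69 | 0.36 | 6234 | 0.65 | |
| VY Cet | 0.341 | 0.666 | A | | 1.02 | 1.01 | 5393 | 0.77 | 9 |
| BD -20° 345 | 11.10 | 2.449 | B | | 0.83 | 0.68 | 5610 | 0.61 | |
| CC Com | 0.221 | 0.518 | A | | 0.79 | 0.41 | 4302 | 0.17 | 2,3,10 |
| | 11.00 | 1.634 | B | | 0.54 | 0.73 | 4500 | 0.11 | |
| EK Com | 0.267 | 0.580 | A | | 0.93 | 0.92 | 5000 | 0.47 | 11 |
| | 12.70 | 1.981 | B | | 0.54 | 0.28 | 5310 | 0.20 | |
| FS CrA | 0.264 | 0.755 | A | | 0.86 | 0.82 | 4567 | 0.26 | 3,10 |
| | 13.80 | 1.984 | B | | 0.73 | 0.65 | 4700 | 0.23 | |
| YY Eri | 0.322 | 0.693 | A | G5 | 1.01 | 1.02 | 5389 | 0.79 | 1,2,3,13 |
| HD 26609 | 8.80 | 2.361 | B | | 0.70 | 0.44 | 5585 | 0.43 | |
| SY Hor | 0.312 | 0.659 | A | | 0.97 | 0.95 | 4934 | 0.47 | 3,9 |
| | 11.40 | 2.266 | B | | 0.83 | 0.64 | 5240 | 0.47 | |
| V508 Oph | 0.345 | 0.527 | A | G5 | 1.01 | 1.06 | | 0.087 | 14 |
| BD +13°3496 | 10.00 | 2.444 | B | | 0.52 | 0.80 | | -0.286 | |
| BB Peg | 0.362 | 0.405 | A | F8 | 1.16 | 1.21 | 5883 | 1.58 | 3,15 |
| | 10.80 | 2.512 | B | | 0.78 | 0.47 | 6200 | 0.81 | |
| U Peg | 0.375 | 0.579 | A | G2V | 1.33 | 1.28 | 5515 | 2.80 | 2,3,16 |
| BD +15°4915 | 9.70 | 2.800 | B | | 0.77 | 0.44 | 5800 | 1.28 | |
| AE Phe | 0.362 | 0.401 | A | G1/G2V | 1.17 | 1.19 | 6000 | 1.63 | 3,13 |
| HD 9528 | 8.30 | 2.521 | B | | 0.79 | 0.47 | 6145 | 0.79 | |
| YZ Phe | 0.234 | 0.597 | A | | 0.87 | 0.79 | 4800 | 0.30 | 17 |
| | 12.50 | 1.786 | B | | 0.52 | 0.35 | 5055 | 0.16 | |
| FG Sct | 0.271 | 0.781 | A | | 0.87 | 0.73 | 4662 | 0.29 | 3,10 |
| | 13.70 | 2.036 | B | | 0.68 | 0.83 | 4800 | 0.25 | |
| RZ Tau | 0.416 | 0.369 | A | A7V | 1.57 | 1.51 | 7200 | 5.51 | 1,2,18,19 |
| HD 285892 | 10.50 | 3.024 | B | | 1.00 | 0.58 | 7146 | 2.34 | |
| BP Vel | 0.265 | 0.722 | A | | 0.90 | 0.86 | 4717 | 0.33 | 20 |
| | 12.90 | 2.009 | B | | 0.65 | 0.48 | 5000 | 0.23 | |
| BI Vul | 0.252 | 0.686 | A | | 0.86 | 0.82 | 4549 | 0.26 | 3,10 |
| | | 1.898 | B | | 0.70 | 0.59 | 4600 | 0.20 | |
| W UMa | 0.334 | 0.731 | A | F8V:p | 1.08 | 1.10 | 5800 | 0.87 | 1,2,3 |
| HD 83950 | 8.30 | 2.505 | B | | 0.79 | 0.51 | 6194 | 0.60 | |
| AA UMa | 0.468 | 0.547 | A | G0 | 1.26 | 1.40 | 5932 | 2.17 | 3,6 |
| | 11.30 | 3.168 | B | | 1.10 | 0.69 | 6030 | 1.43 | |
| AW UMa | 0.439 | 0.349 | A | | 1.52 | 1.60 | 7175 | 6.06 | 1,2,18 |
| HD 99946 | 7.27 | 3.221 | B | | 0.53 | 0.11 | 6875 | 0.56 | |
| RZ Pyx | 0.656 | 0.821 | A | B7V | 5.76 | 2.69 | 4.230 | 2.73 | 21 |
| HD 75920 | 8.85 | 6.954 | B | | 4.73 | 2.51 | 4.225 | 2.65 | |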